The Signature Molecular Descriptor. 4. Canonizing Molecules Using Extended Valence Sequences

Authors

  • Jean-Loup Faulon
  • Michael J. Collins
  • Robert D. Carr
Abstract

We present a new algorithm to canonize molecular graphs using the signature molecular descriptor introduced in the previous papers of this series. While developed specifically for molecular structures, the algorithm can be used for any graph and is not limited to acyclic graphs, planar graphs, bounded-valence graphs, or bounded-genus graphs, for which polynomial-time algorithms exist. The algorithm is tested with benzenoid hydrocarbons and a database of 126,705 organic compounds. Its performance is compared against Brendan McKay's Nauty algorithm, which is believed to be the fastest graph canonization algorithm for general graphs, on five series of graphs each comprising up to 30,000 vertices: 2D meshes (pericondensed benzenoids), 3D cages (fullerenes and nanotubes), 3D meshes (crystal lattices), 4D cages, and power-law graphs (protein and gene networks). The algorithm can be downloaded as open source code at http://www.cs.sandia.gov/~jfaulon/QSAR.
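To make the term "canonization" concrete, the sketch below computes a canonical form for a tiny unlabeled graph by brute force: it tries every vertex relabeling and keeps the lexicographically smallest edge list, so isomorphic graphs always yield the same result. This is not the signature-based algorithm of the paper, nor Nauty. The function name `canonical_form` and the example graphs are hypothetical, atom types and bond orders are ignored, and the factorial cost restricts it to very small graphs.

```python
from itertools import permutations


def canonical_form(n, edges):
    """Return the smallest sorted edge list over all vertex relabelings.

    Illustrative brute force only: it enumerates all n! permutations,
    so it is usable only for very small graphs.
    """
    best = None
    for perm in permutations(range(n)):
        # Relabel every edge, normalize each edge as (low, high), then sort.
        relabeled = tuple(
            sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges)
        )
        if best is None or relabeled < best:
            best = relabeled
    return best


# Two different labelings of the same 4-vertex path
# (the carbon skeleton of n-butane).
path_a = [(0, 1), (1, 2), (2, 3)]
path_b = [(2, 0), (0, 3), (3, 1)]

# A 4-vertex star (the carbon skeleton of isobutane), not isomorphic to the path.
star = [(0, 1), (0, 2), (0, 3)]

same = canonical_form(4, path_a) == canonical_form(4, path_b)
different = canonical_form(4, path_a) != canonical_form(4, star)
```

Here `same` and `different` both evaluate to `True`: the two path labelings collapse to one canonical form, while the star keeps a distinct one. Practical canonizers avoid the exhaustive search by pruning with vertex invariants, which is the role the abstract assigns to the signature descriptor.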

Related articles

The Signature Molecular Descriptor. 2. Enumerating Molecules from Their Extended Valence Sequences

We present a new algorithm that enumerates molecular structures matching a predefined extended valence sequence or signature. The algorithm can construct molecular structures composed of about 50 non-hydrogen atoms on a time scale of CPU seconds. The algorithm is run to produce all molecular structures matching the binding affinities (IC(50)) of some HIV-1 protease inhibitors. The algorithm is also ...

Molecular Study of Mycobacterium avium-intracellular Complex Strains

It is difficult to distinguish between clinically significant slowly growing, non-pigmented mycobacteria, notably to separate M. avium and M. intracellulare from one another and from M. scrofulaceum strains. The purpose of this study was to evaluate the extent to which 16S rRNA sequencing could be used to highlight the taxonomic relationships of the mycobacterial strains, which are difficult to ...

Biological Activity of Chemical Compounds and Their Molecular Structure-Information Approach

A method is proposed for bioscreening chemical compounds. We propose systemic signs within the framework of an informational approach and a statistical method for comparing molecular qualitative characters. Using the methods of information theory, we offer four classification rules that allow preparations with high biological action (radioprotection) to be distinguished with statistical reliability. Fo...

ChemVassa: A New Method for Identifying Small Molecule Hits in Drug Discovery

ChemVassa, a new chemical structure search technology, was developed to allow rapid in silico screening of compounds for hit and hit-to-lead identification in drug development. It functions by using a novel type of molecular descriptor that examines, in part, the structure of the small molecule undergoing analysis, yielding its "information signature." This descriptor takes into account the ato...

Molecular Docking and QSAR Study of 2-Benzoxazolinone, Quinazoline and Diazocoumarin Derivatives as Anti-HIV-1 Agents

A series of 2-benzoxazolinone, diazocoumarin and quinazoline derivatives has been shown to inhibit HIV replication in cell culture. To understand the pharmacophore properties of selected molecules and design new anti-HIV agents, a quantitative structure–activity relationship (QSAR) study was developed using a descriptor selection approach based on the stepwise method. Multiple linear regression ...

Journal title:
  • Journal of Chemical Information and Computer Sciences

Volume 44, Issue 2

Pages: -

Publication date: 2004